Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bump tesseract.js from 4.1.4 to 5.0.0 #98

Merged
merged 1 commit into from
Sep 30, 2023

Conversation

dependabot[bot]
Copy link
Contributor

@dependabot dependabot bot commented on behalf of github Sep 28, 2023

Bumps tesseract.js from 4.1.4 to 5.0.0.

Release notes

Sourced from tesseract.js's releases.

v5.0.0

What's Changed

Major New Features

  1. Significantly smaller file sizes
    1. 54% smaller file sizes for English, 73% smaller for Chinese (see #806 for details)
    2. This results in a ~50% decrease in runtime for first-time users (who do not yet have the data downloaded/cached)
  2. Significantly lower memory usage
    1. Worker memory utilization in the web benchmark is reduced from 311 MB to 164 MB (47% reduction)
    2. The lower memory footprint makes it feasible to use more workers, significantly improving performance for projects that utilize schedulers for parallel processing
  3. Compatible with iOS 17 (using default settings)
    1. iOS 17 broke compatibility with Tesseract.js v4--upgrading to v5 should resolve
      1. See discussion section below for details

Breaking Changes Impacting Many Users

  1. createWorker arguments changed
    1. Setting non-default language and OEM now happens in createWorker
      1. E.g. createWorker("chi_sim", 1)
  2. worker.initialize and worker.loadLanguage functions now do nothing and can be deleted from code
    1. Loading the language and initialization now occurs in createWorker
    2. Workers can be re-initialized with different settings using worker.reinitialize

In other words, code should be modified from this:

const worker = await Tesseract.createWorker();
await worker.loadLanguage('eng');
await worker.initialize('eng');
const ret = await worker.recognize(file);

To this:

const worker = await Tesseract.createWorker("eng");
const ret = await worker.recognize(file);

Breaking Changes Impacting Fewer Users

  1. Users who manually set corePath will need to update the contents of their corePath directory
    1. corePath should point to a directory that contains all 4 of the files below from Tesseract.js-core v5:
      1. tesseract-core.wasm.js
      2. tesseract-core-simd.wasm.js
      3. tesseract-core-lstm.wasm.js
      4. tesseract-core-simd-lstm.wasm.js
    2. Tesseract.js will automatically select the correct version to use
  2. worker.detect function disabled by default
    1. Orientation + script detection is a function of the Legacy model only, which is no longer included by default
    2. To enable, set arguments legacyCore: true and legacyLang: true in createWorker options
      1. E.g. Tesseract.createWorker("eng", 1, {legacyCore: true, legacyLang: true});

... (truncated)

Commits

Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [tesseract.js](https://github.com/naptha/tesseract.js) from 4.1.4 to 5.0.0.
- [Release notes](https://github.com/naptha/tesseract.js/releases)
- [Commits](naptha/tesseract.js@v4.1.4...v5.0.0)

---
updated-dependencies:
- dependency-name: tesseract.js
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <[email protected]>
@dependabot dependabot bot added the dependencies Pull requests that update a dependency file label Sep 28, 2023
@RezkyRizaldi RezkyRizaldi merged commit ab69559 into main Sep 30, 2023
5 checks passed
@dependabot dependabot bot deleted the dependabot/npm_and_yarn/tesseract.js-5.0.0 branch September 30, 2023 12:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dependencies Pull requests that update a dependency file size/XS
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant